A Methodology for Biologically Relevant Pattern Discovery from Gene Expression Data
Abstract
One of the most exciting scientific challenges in functional genomics is the discovery of biologically relevant patterns from gene expression data. For instance, it is extremely useful to provide molecular biologists with putative synexpression groups or transcription modules. We propose a methodology that has proved useful in real cases. It is described as a prototypical KDD (knowledge discovery in databases) scenario that runs from the selection of raw expression data to the delivery of useful patterns. Our conceptual contribution is (a) to show how to make the most of recent progress in constraint-based mining of set patterns, and (b) to propose a generic approach to gene expression data enrichment. The methodology has been validated on real data sets.
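To make the idea of constraint-based mining of set patterns more concrete, here is a minimal sketch, not the authors' implementation. It assumes expression levels are discretized with a single threshold into sets of "over-expressed" genes per biological situation, and it enumerates gene sets under two illustrative constraints: a minimum frequency and a minimum size. The function names, the threshold and the toy data are all hypothetical.

```python
from itertools import combinations


def discretize(expression, threshold):
    """Turn each situation (row) into the set of genes whose level exceeds
    the threshold. A single global threshold is an assumption of this sketch."""
    return [
        {gene for gene, level in row.items() if level > threshold}
        for row in expression
    ]


def frequent_gene_sets(transactions, min_support, min_size=1):
    """Level-wise (Apriori-style) enumeration of gene sets that occur in at
    least `min_support` situations and contain at least `min_size` genes."""

    def support(candidate):
        return sum(1 for t in transactions if candidate <= t)

    genes = sorted(set().union(*transactions))
    level = {}
    for gene in genes:
        single = frozenset([gene])
        count = support(single)
        if count >= min_support:
            level[single] = count

    result = {}
    while level:
        # Keep only the sets that also satisfy the size constraint.
        result.update({c: s for c, s in level.items() if len(c) >= min_size})
        # Join step: merge pairs of frequent sets that differ by one gene.
        candidates = {
            a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1
        }
        # Prune step: every subset one gene smaller must itself be frequent.
        candidates = {
            c for c in candidates if all(c - {g} in level for g in c)
        }
        next_level = {}
        for c in candidates:
            count = support(c)
            if count >= min_support:
                next_level[c] = count
        level = next_level
    return result


# Toy data: four situations, three genes (values are made up).
expression = [
    {"g1": 2.1, "g2": 1.8, "g3": 0.2},
    {"g1": 1.9, "g2": 2.2, "g3": 1.7},
    {"g1": 0.1, "g2": 1.6, "g3": 1.9},
    {"g1": 2.4, "g2": 2.0, "g3": 0.3},
]
transactions = discretize(expression, threshold=1.0)
patterns = frequent_gene_sets(transactions, min_support=2, min_size=2)
```

On this toy input, `patterns` holds the gene pairs that are over-expressed together in at least two situations. Such sets are only candidates for synexpression groups; judging their biological relevance is the step the methodology addresses through data enrichment.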
Similar resources
Integrative Biomarker Discovery for Breast Cancer Metastasis from Gene Expression and Protein Interaction Data Using Error-tolerant Pattern Mining
Biomarker discovery for complex diseases is a challenging problem. Most of the existing approaches identify individual genes as disease markers, thereby missing the interactions among genes. Moreover, often only a single biological data source is used to discover biomarkers. These factors account for the discovery of inconsistent biomarkers. In this paper, we propose a novel error-tolerant patter...
Pattern Discovery from Biosequences
In this thesis we have developed novel methods for analyzing biological data, the primary sequences of the DNA and proteins, the microarray based gene expression data, and other functional genomics data. The main contribution is the development of the pattern discovery algorithm SPEXS, accompanied by several practical applications for analyzing real biological problems. For performing these bio...
Contribution to Gene Expression Data Analysis by Means of Set Pattern Mining
One of the exciting scientific challenges in functional genomics concerns the discovery of biologically relevant patterns from gene expression data. For instance, it is extremely useful to provide putative synexpression groups or transcription modules to molecular biologists. We propose a methodology that has been proved useful in real cases. It is described as a prototypical KDD scenario which...
Applying Biclustering with the "Large Average Submatrices" Method to Gene Expression Data from DNA Microarrays
Background and Objective: In recent years, DNA microarray technology has become a central tool in genomic research. This technology, which makes it possible to analyze expression levels for thousands of genes simultaneously under different conditions, yields massive amounts of information. While traditional clustering methods, such as hierarchical and K-means clustering have been...
Gene set analysis using gene subset information and multiple testing
Motivation: Using gene expression data, biologists are often interested in determining whether some pre-defined sets of genes are differentially expressed under varying experimental conditions. Often each of these sets is a union of biologically important subsets of genes. Several procedures are available for performing gene set analysis but they do not take into account such additional informa...
Publication date: 2004